[Document Name] Specification



[Title of Invention] Human Galectin-9-Like Protein and cDNA For Encoding the Same 
[Claims] 

[Claim 1] A protein including the amino acid sequence expressed by Sequence No. 1. 

[Claim 2] The protein described in Claim 1, wherein the protein includes the amino acid 
sequence expressed by Sequence No. 2. 

[Claim 3] cDNA including the base sequence expressed by Sequence No. 3. 

[Claim 4] The cDNA described in Claim 3, wherein the cDNA includes the base 
sequence expressed by Sequence No. 4. 

[Claim 5] The cDNA described in Claim 3 and Claim 4, wherein the cDNA includes the 
base sequence expressed by Sequence No. 5. 

[Detailed Explanation of the Invention] 
[0001] 

[Industrial Field of Application] 

The present invention relates to a human galectin-9-like protein and cDNA encoding
the same. The protein of the present invention can be used as a medicine or as a
reagent for sugar chain research. The human cDNA of the present invention can be
used as a probe for genetic diagnosis or as a gene source for gene therapy. The cDNA
can also be used as a gene source for mass production of the encoded protein.

[0002] 
[Prior Art] 

Galectins are animal lectins that bind to galactose. Animal lectins are located in various
places such as the cytoplasm, nuclei and cell membranes. They are believed to contribute
to cell proliferation, differentiation, canceration, metastasis and immunity
[Drickamer, K., Annu. Rev. Cell Biol. 9: 237-264 (1993)]. Nine galectins (galectin-1
through galectin-9) are currently known.

[0003] 

Galectin-9 is a lectin that was identified as the antigen protein that reacts with the
antibodies contained in the serum of patients with Hodgkin's lymphoma [Tureci, O., J.
Biol. Chem. 272: 6416-6422 (1997)]. Galectin-9, like galectin-4 and galectin-8, has a
structure in which two sugar chain binding domains are linked by a linker peptide. Its role
in the body is not yet completely understood, but it is believed to contribute to adhesion
between cells. Two types of galectin-9 with different molecular weights have been
reported in mice [Wada, J. and Kanwar, Y.S., J. Biol. Chem. 272: 6078-6086 (1997)], but
different isoforms in human beings have not been reported.
[0004]



[Problem Solved by the Invention] 

The purpose of the present invention is to provide a human galectin-9-like protein and 
cDNA for encoding the same. 

[0005] 

[Means of Solving the Problem] 

As a result of extensive research, the present inventors succeeded in cloning human cDNA
encoding a galectin-9-like protein, and the present invention is the product of this
work. In other words, the present invention provides a galectin-9-like protein, namely
proteins containing the amino acid sequences expressed by Sequence No. 1 and
Sequence No. 2. The present invention also provides cDNA containing the base
sequences expressed by Sequence No. 3 through Sequence No. 5, which encode these
proteins.

[0006] 

[Embodiment of the Invention] 

The proteins of the present invention can be obtained by isolating them from human
organs and cell lines, by preparing peptides through chemical synthesis based on the
amino acid sequences of the present invention, or by producing them through recombinant
DNA technology using DNA encoding the human galectin-9-like proteins of the
present invention. Ideally, they should be obtained using recombinant DNA technology.
For example, RNA can be prepared by in vitro transcription from a vector carrying the
cDNA of the present invention, and the protein can then be expressed in vitro by in vitro
translation using the RNA as the template. If the translated region is recombined into an
appropriate expression vector using a method well known in the art, the encoded protein
can be expressed in large amounts in E. coli, Bacillus subtilis, yeast and animal cells.

[0007] 

[Means of Solving the Problem] 

When a protein of the present invention is expressed by a microorganism such as E. coli,
an expression vector is prepared in which the translated region of the cDNA of the
present invention is recombined with an expression vector having an origin of replication,
promoter, ribosome binding site, cDNA cloning site and terminator that function in the
microorganism. If host cells are transformed with this expression vector and the
transformants are cultivated, the protein encoded by the cDNA can be mass produced
in the microorganism. The protein can also be expressed as a fusion protein with another
protein. The protein portion encoded by the cDNA can be obtained by cleaving the
fusion protein with an appropriate protease. If it has lactose binding activity, the
fusion protein itself is also considered a protein of the present invention.
[0008]

When a protein of the present invention is to be secreted and expressed by animal cells,
the translated region of the cDNA is recombined with an animal cell expression vector
having a promoter, splicing region and poly(A) addition site. If this vector is introduced
into animal cells, the protein of the present invention can be secreted and expressed
outside of the cells.



[0009] 

[Preferred Embodiment of the Invention] 

The proteins of the present invention include peptide fragments (with 5 or more amino
acid residues) containing a partial amino acid sequence of the amino acid sequence
expressed by Sequence No. 1. These peptide fragments can be used as antigens to
produce antibodies. The proteins of the present invention can be secreted outside
of cells. Because there are sites in the amino acid sequence where sugar chains can
be attached, a protein with an added sugar chain can be obtained if it is expressed in an
appropriate animal cell. Therefore, these peptides and proteins with added sugar chains
are also considered proteins of the present invention.



[0010] 

The DNA of the present invention includes all DNA that encodes these proteins. The 
DNA can be obtained using the chemical synthesis method and the cDNA cloning 
method. 



[0011] 

The human cDNA of the present invention can be cloned from a cDNA library derived
from human cells. The cDNA library is prepared using poly(A)+ RNA extracted from
human cells as the template. The human cells may be any cells that can be
removed from human beings during surgery and cultivated. In this embodiment, poly(A)+
RNA isolated from stomach cancer tissue is used. The cDNA synthesis can be
performed using the Okayama-Berg Method [Okayama, H. and Berg, P., Mol. Cell. Biol.
2: 161-170 (1982)] or the Gubler-Hoffman Method [Gubler, U. and Hoffman, J., Gene 25:
263-269 (1983)]. In order to obtain full-length clones efficiently, the Capping Method
[Kato, S. et al., Gene 150: 243-250 (1994)] should be used. The cDNA can be
identified by determining the entire base sequence by sequencing, searching
known proteins for sequences identical or similar to the amino acid sequence predicted
from the base sequence, expressing the protein by in vitro translation, expressing the
protein in E. coli, and measuring the activity of the expressed products. The activity
is measured by confirming binding to lactose.



[0012] 

The cDNA of the present invention is characterized by the inclusion of the base sequences
expressed by Sequence No. 3 and Sequence No. 4. The cDNA expressed by Sequence No. 5
consists of a 1725 bp base sequence and contains a 1068 bp open reading frame. This open
reading frame encodes a protein of 355 amino acid residues. This protein shows high
(69.3%) similarity to the mouse galectin-9 isoform at the amino acid sequence level.
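
The lengths quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch in Python, not part of the original specification. It assumes that the 1068 bp open reading frame includes the stop codon (which is consistent with the 1065-base length listed for Sequence No. 4) and uses the CDS position 82..1149 given in the sequence chart for Sequence No. 5. The function and variable names are illustrative only.

```python
# Figures quoted in the specification for Sequence No. 5.
TOTAL_LENGTH_BP = 1725
ORF_LENGTH_BP = 1068
CDS_START, CDS_END = 82, 1149  # 1-based, inclusive, from the sequence chart


def residues_encoded(orf_length_bp: int) -> int:
    """Number of amino acids encoded by an ORF that ends in a stop codon.

    Assumption: the quoted ORF length includes the terminal stop codon.
    """
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_length_bp // 3 - 1  # the final codon is the stop codon


cds_span = CDS_END - CDS_START + 1            # length of the coding region
five_prime_utr = CDS_START - 1                # bases before the start codon
three_prime_utr = TOTAL_LENGTH_BP - CDS_END   # bases after the stop codon

print(residues_encoded(ORF_LENGTH_BP))            # 355
print(cds_span, five_prime_utr, three_prime_utr)  # 1068 81 576
```

Under these assumptions the figures agree with one another: 356 codons minus the stop codon give 355 residues, and the 81 bp and 576 bp untranslated regions plus the 1068 bp reading frame add up to 1725 bp.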
[0013]



Clones identical to the cDNA of the present invention can be obtained by screening a 
human cDNA library prepared from human cells using an oligonucleotide probe 
synthesized based on the cDNA base sequence described in Sequence No. 3. 

[0014] 

Polymorphisms are frequently found in human genes due to individual differences.
Therefore, cDNA in which one or more nucleotides have been added, deleted and/or
substituted by other nucleotides in Sequence No. 3 through Sequence No. 5 is also
included in the present invention.

[0015] 

Similarly, proteins with one or more amino acids added, deleted and/or substituted by 
other amino acids are also included in the present invention to the extent that they have 
human galectin-9 activity. 

[0016] 

The cDNA of the present invention also includes cDNA fragments (10 bp or more) with a 
partial base sequence from the base sequence expressed by Sequence No. 3. This 
includes DNA fragments with sense strands and antisense strands. These DNA 
fragments can be used as probes for genetic diagnostics. 
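
As an illustration of what sense-strand and antisense-strand fragments of a given length look like, the sketch below enumerates every window of a chosen length along a sequence together with its reverse complement. It is a hypothetical helper, not the inventors' procedure. Because Sequence No. 3 is not reproduced here, the example input is the 30-base synthetic oligonucleotide quoted in working example (5).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the antisense strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_fragments(seq: str, length: int = 10):
    """Yield (start, sense, antisense) for each window of the given length.

    start is the 1-based position of the window on the sense strand.
    """
    seq = seq.upper()
    for i in range(len(seq) - length + 1):
        sense = seq[i:i + length]
        yield i + 1, sense, reverse_complement(sense)


# Oligonucleotide quoted in working example (5); used here only as sample input.
OLIGO = "AACGGCACCGTGGAGAAGGCAGGCTGAGCA"

for start, sense, antisense in probe_fragments(OLIGO, length=10):
    print(start, sense, antisense)
```

A 30-base input yields 21 overlapping 10-base windows. Longer probes can be listed by raising `length`.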

[0017] 

[Working Examples] 

The following is a detailed explanation of the present invention with reference to working
examples. The present invention is by no means restricted to these working examples.
Basic manipulations and enzymatic reactions for recombinant DNA are described
elsewhere ["Molecular Cloning: A Laboratory Manual", Cold Spring Harbor Laboratory,
1989]. Where unspecified, the restriction enzymes and modification enzymes
manufactured by Takara Shuzo Co., Ltd. were used. The buffer solution compositions
and reaction conditions for the various enzymatic reactions are described in the
accompanying instructions.

[0018] 

(1) cDNA Cloning

As a result of large-scale base sequence determination of clones selected from the human
stomach cancer cell cDNA library (WO 97/03190), clone HP01461 was obtained. This
clone had a 5' untranslated region of 81 bp, an open reading frame of 1068 bp, a 3'
untranslated region of 576 bp, and a poly(A) tail of 83 bp (Sequence No. 5). The
open reading frame encodes a protein consisting of 355 amino acid residues. When a
protein database was searched using this sequence, it showed high similarity to
human galectin-9 and mouse galectin-9. A comparison of the amino acid sequences of
the human galectin-9-like protein of the present invention (HS) and human galectin-9
(G9) is shown in Chart 1, and a comparison of the amino acid sequences of the human
galectin-9-like protein of the present invention (HS) and the mouse galectin-9 isoform
(MM) is shown in Chart 2. Here, a dash (-) indicates a gap, an asterisk (*) indicates an
amino acid residue identical to one in the protein of the present invention, and a
period (.) indicates an amino acid residue resembling one in the protein of the present
invention. The comparison of the protein of the present invention to human galectin-9
showed only six differences: the 88th residue is lysine (arginine in G9), a glycine is
inserted at the 96th position, the 135th residue is serine (phenylalanine in G9), 32 amino
acid residues are inserted from the 149th to the 180th position, the 270th residue is
proline (leucine in G9), and the 313th residue is glutamic acid (glycine in G9). When the
protein of the present invention was compared to the mouse galectin-9 isoform, the
similarity was 69.3%, with the protein of the present invention longer by only two amino
acid residues. As a result, the protein of the present invention can be considered a
homologue of the mouse galectin-9 isoform.
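
The kind of difference list given above can be derived mechanically from a gapped alignment. The sketch below is a hypothetical helper, not the tool the inventors used. It walks two aligned strings of equal length and reports substitutions and insertions, numbered by the residue position in the first sequence. The two 60-character strings are the second row of the HS/G9 alignment (residues 61 to 120 of HS) as printed in Chart 1 of the original application.

```python
def alignment_differences(query: str, reference: str, query_start: int = 1):
    """List differences between two gapped, aligned sequences.

    Positions are numbered along the query (gaps in the query are not counted).
    Returns a list of (query_position, description) tuples.
    """
    if len(query) != len(reference):
        raise ValueError("aligned sequences must have the same length")
    diffs = []
    pos = query_start - 1
    for q, r in zip(query, reference):
        if q != "-":
            pos += 1
        if q == r:
            continue
        if r == "-":
            diffs.append((pos, f"insertion of {q}"))
        elif q == "-":
            # Reported at the last query residue before the gap.
            diffs.append((pos, f"deletion of {r}"))
        else:
            diffs.append((pos, f"{q} (reference {r})"))
    return diffs


# Second alignment row of Chart 1 (HS residues 61-120).
HS_ROW = "HFNPRFEDGGYVVCNTRQNGSWGPEERKTHMPFQKGMPFDLCFLVQSSDFKVMVNGILFV"
G9_ROW = "HFNPRFEDGGYVVCNTRQNGSWGPEERRTHMPFQK-MPFDLCFLVQSSDFKVMVNGILFV"

print(alignment_differences(HS_ROW, G9_ROW, query_start=61))
# [(88, 'K (reference R)'), (96, 'insertion of G')]
```

For this row the output reproduces the first two of the six differences listed in the text: the lysine at position 88 and the inserted glycine at position 96.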

[0019] 

Chart 1 

(Omitted) 

[0020] 

Chart 2 

(Omitted) 

[0021] 

(2) Protein Synthesis Using In Vitro Translation 

In vitro translation was performed with a TNT rabbit reticulocyte lysate kit (Promega
Co., Ltd.) using vector pHP01461 carrying the cDNA of the present invention. [35S]
methionine was added at this time, so that the expressed product was labeled with a
radioisotope. The reaction was performed according to the accompanying protocols.
Specifically, a total of 100 µl of reaction solution containing 2 µg of plasmid pHP01461,
50 µl of TNT rabbit reticulocyte lysate, 4 µl of buffer solution (accompanying the kit),
2 µl of amino acid mixed solution (methionine-free), 8 µl of [35S] methionine (Amersham
Co., Ltd.) (0.37 MBq/µl), 2 µl of T7 RNA polymerase, and 80 U of RNasin was reacted for
90 minutes at 30°C. Then 2 µl of SDS sampling buffer (125 mM tris-hydrochloric acid,
pH 6.8, 120 mM 2-mercaptoethanol, 2% SDS solution, 0.025% bromophenol blue, 20%
glycerol) was added to 3 µl of the reaction solution. After heating at 95°C for three
minutes, SDS-polyacrylamide gel electrophoresis was performed. Autoradiography was then
performed, and the molecular weight of the translated product was determined. As a
result, it was determined that the cDNA of the present invention produced a translated
product with a molecular weight of approximately 40 kDa (FIG 2). This value matched
the molecular weight of 39,517 anticipated from the amino acid sequence expressed
by Sequence No. 2. This indicates that the cDNA encodes the protein expressed by
Sequence No. 2.
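
A predicted molecular weight of this kind can be estimated from the amino acid sequence alone. The sketch below is an illustration under stated assumptions, not the inventors' calculation. It sums standard average residue masses and adds one water molecule for the free termini, ignoring any modification. The 355-residue string is the HS sequence as printed in the alignment charts of the original application. The software and mass table behind the 39,517 figure are not stated in the specification, so a small deviation from that value would not be surprising.

```python
# Standard average masses of amino acid residues (Da), i.e. amino acid minus water.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # added once for the free N- and C-termini


def molecular_weight(protein: str) -> float:
    """Average molecular weight (Da) of an unmodified linear polypeptide."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in protein.upper()) + WATER


# HS sequence (Sequence No. 2) as printed in Charts 1 and 2.
HS = (
    "MAFSGSQAPYLSPAVPFSGTIQGGLQDGLQITVNGTVLSSSGTRFAVNFQTGFSGNDIAF"
    "HFNPRFEDGGYVVCNTRQNGSWGPEERKTHMPFQKGMPFDLCFLVQSSDFKVMVNGILFV"
    "QYFHRVPFHRVDTISVNGSVQLSYISFQNPRTVPVQPAFSTVPFSQPVCFPPRPRGRRQK"
    "PPGVWPANPAPITQTVIHTVQSAPGQMFSTPAIPPMMYPHPAYPMPFITTILGGLYPSKS"
    "ILLSGTVLPSAQRFHINLCSGNHIAFHLNPRFDENAVVRNTQIDNSWGSEERSLPRKMPF"
    "VRGQSFSVWILCEAHCLKVAVDGQHLFEYYHRLRNLPTINRLEVGGDIQLTHVQT"
)

print(len(HS))                      # 355
print(round(molecular_weight(HS)))  # compare with the 39,517 quoted in the text
```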
[0022]



(3) Measurement of Lactose Binding Activity of the In Vitro Translation Product

After 100 ml of Sepharose 4B gel suspension (Pharmacia Corp.) was thoroughly rinsed
in 0.5 M sodium carbonate, it was suspended in 100 ml of 0.5 M sodium carbonate.
Then 10 ml of vinylsulfone was added and stirred in gently for an hour at room
temperature. After rinsing the gel with 0.5 M sodium carbonate, it was suspended in a
10% lactose, 0.5 M sodium carbonate solution, and stirred gently overnight at room
temperature. The gel was then rinsed successively in 0.5 M sodium carbonate, water,
and 0.05 M phosphate buffer solution (pH 7.0). The lactose-fixed Sepharose 4B
gel obtained in this manner was stored at 4°C in a 0.05 M phosphate buffer solution
(pH 7.0) containing 0.02% sodium azide.

[0023] 

Next, 100 µl of the in vitro translation reaction solution was applied to a lactose-fixed
Sepharose 4B column (bed volume 4.5 ml) prepared beforehand. After the column was
rinsed with 20 ml of lactose column buffer solution (20 mM tris-hydrochloric acid buffer
solution, pH 7.5, 2 mM EDTA, 150 mM NaCl, 4 mM 2-mercaptoethanol, 0.01% Triton
X-100), elution was performed with 20 ml of the buffer solution containing 0.3 M lactose.
Because the eluted fractions contained the 40 kDa translation product, the protein of the
present invention clearly has lactose binding capacity (FIG 2).

[0024] 

(4) Galectin-9-Like Protein Expression and Lactose Binding Activity in E. coli

After 1 µg of plasmid pHP01461 had been digested with 20 units of EcoRI and 20 units
of NotI, 0.8% agarose gel electrophoresis was performed, and a 1.7 kbp DNA fragment
was extracted. After 1 µg of E. coli expression vector pET21a (Novagen Corp.) had
been digested with 20 units of EcoRI and 20 units of NotI, 0.8% agarose gel
electrophoresis was performed, and a 5.3 kbp DNA fragment was extracted. After
the DNA fragments were joined using a ligation kit, E. coli JM109 was transformed.
Plasmid pET-1461 was prepared from the transformant, and the target recombinant was
confirmed by restriction enzyme mapping.

[0025] 

Two oligonucleotide primers, PR1 (5'-CGCATATGGCCTTCAGCGGTTCCCAGGC-3')
and PR2 (5'-AACGGCACCGTGGAGAAGGCAGGCTGAACA-3'), were synthesized using
a DNA automatic synthesizer (Applied Biosystems Corp.) according to the
accompanying protocols. The 5' portion of the translated region of the cDNA was
amplified with a PCR kit (Takara Shuzo Co., Ltd.) using 1 ng of plasmid pHP01461 and
100 pmole each of primer PR1 and primer PR2. After phenol extraction and ethanol
precipitation, the product was digested with 20 units each of SacI and NdeI. Next, 1.2%
agarose gel electrophoresis was performed on the reaction product, and a 320 bp DNA
fragment was excised and purified.
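
The design of primer PR1 can be checked directly against the sequences given in this specification. The sketch below is an illustrative check, not part of the original procedure. It locates the NdeI recognition site (CATATG) in PR1 and translates the codons that follow the ATG embedded in that site, using only the subset of the standard genetic code that the primer requires. The expected N-terminus is taken from Sequence No. 2.

```python
PR1 = "CGCATATGGCCTTCAGCGGTTCCCAGGC"
NDEI_SITE = "CATATG"    # NdeI recognition sequence; its ATG serves as the start codon
N_TERMINUS = "MAFSGSQ"  # first seven residues of Sequence No. 2

# Subset of the standard genetic code covering the codons present in PR1.
CODONS = {
    "ATG": "M", "GCC": "A", "TTC": "F", "AGC": "S",
    "GGT": "G", "TCC": "S", "CAG": "Q",
}


def translate_from_start(primer: str) -> str:
    """Translate complete codons beginning at the first ATG in the primer."""
    start = primer.find("ATG")
    if start < 0:
        raise ValueError("no start codon found")
    coding = primer[start:]
    usable = len(coding) - len(coding) % 3  # drop a trailing partial codon
    return "".join(CODONS[coding[i:i + 3]] for i in range(0, usable, 3))


site_index = PR1.find(NDEI_SITE)
print(site_index)                 # 2
print(translate_from_start(PR1))  # MAFSGSQ
print(translate_from_start(PR1) == N_TERMINUS)
```

The primer places an NdeI site so that its ATG coincides with the start codon of the open reading frame, which is what allows the amplified fragment to be joined in frame at the NdeI site of the expression vector.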
[0026]



After 1 µg of plasmid pET-1461 had been digested with 20 units each of SacI and NdeI,
0.8% agarose gel electrophoresis was performed, and a 3.8 kbp DNA fragment was
extracted. After this DNA fragment and the 320 bp DNA fragment prepared beforehand by
PCR were joined using a ligation kit, E. coli BL21 (DE3) was transformed. Plasmid
pET1461 was prepared from the transformant, and the target recombinant was
confirmed by restriction enzyme mapping.



[0027] 

Next, 2 ml of an overnight culture of pET1461/BL21 (DE3) was added to
200 ml of LB medium containing 100 µg/ml ampicillin and cultivated with shaking at 37°C.
When A600 reached 0.5, isopropylthiogalactoside was added to 1 mM. After cultivation at
37°C for another three hours, the culture was centrifuged and the bacteria were suspended
in 25 ml of lactose column buffer solution. After the suspension was sonicated and
centrifuged, the supernatant was applied to a lactose-fixed Sepharose 4B
column (bed volume 2 ml). After the column was rinsed with 10 ml of lactose column
buffer solution, elution was performed with 5 ml of buffer solution containing 0.3 M
lactose. When SDS-polyacrylamide gel electrophoresis was performed on the eluted
protein, a single band at 40 kDa was observed. The molecular weight matched the
anticipated molecular weight of the human galectin-9-like protein. In other words, the
human galectin-9-like protein expressed by E. coli had lactose binding capability.

[0028] 

(5) Northern Blot Hybridization 

In order to investigate the expression pattern in human tissue, northern blot hybridization
was performed. Filters blotted with poly(A)+ RNA isolated from various human tissues were
purchased from Clontech Corp. After plasmid pHP01049 was digested with ApaLI and
BstXI, agarose gel electrophoresis was performed, and cDNA fragments were isolated.
These were labeled with [32P] dCTP (Amersham Corp.) using a random primer labeling
kit (Takara Shuzo Co., Ltd.). For the inserted section, the synthetic
oligonucleotide 5'-AACGGCACCGTGGAGAAGGCAGGCTGAGCA-3' was labeled with
32P at its terminus using T4 polynucleotide kinase. Hybridization was performed using the
solution accompanying the blot filters according to the protocols.



[0029] 

When the cDNA fragment was used as the probe, the strongest expression was in the
peripheral blood. Expression also occurred in the heart, placenta, lungs, spleen, thymus
gland, ovaries, small intestines and large intestines. The size of the transcript was
approximately 2 kb (FIG 3). When the inserted section was used as the probe, the results
were different (FIG 4). The 2 kb band was strongest in the small intestines
and the large intestines. The lungs and the peripheral blood had the weakest expression.
As for bands of other sizes, there was a strong band of less than 1 kb in the liver,
and a 2.4 kb band in the kidneys. When the inserted section was used as the probe, the
expression pattern differed from the human galectin-9 expression pattern. This indicates
that the protein of the present invention is subject to expression control different from
that of human galectin-9. This means the function is probably also different.
[0030]



[Effect of the Invention] 

The present invention provides human cDNA encoding galectin-9-like proteins and
the proteins encoded by the human cDNA. Recombinant proteins can be expressed in large
quantities using the cDNA of the present invention. These recombinant proteins can be
used as medicines or as reagents in research.



[0031] 

[Sequence Chart] 

Sequence No: 1
Sequence Length: 32
Sequence Type: Amino Acid
Topology: Straight Chain
Sequence Category: Protein
Hypothetical: No
Origin:

Product Name: Homo Sapiens
Cell Category: Stomach Cancer Tissue
Clone Name: HP01461
Sequence

[0032]

[Sequence Chart]

Sequence No: 2
Sequence Length: 355
Sequence Type: Amino Acid
Topology: Straight Chain
Sequence Category: Protein
Hypothetical: No
Origin:

Product Name: Homo Sapiens
Cell Category: Stomach Cancer Tissue
Clone Name: HP01461
Sequence

[0033]

[Sequence Chart]

Sequence No: 3
Sequence Length: 96
Sequence Type: Nucleic Acid
Number of Chains: Double Stranded
Topology: Straight Chain
Sequence Category: cDNA to mRNA
Origin:

Product Name: Homo Sapiens
Cell Category: Stomach Cancer Tissue
Clone Name: HP01461
Sequence



[0034] 



[Sequence Chart] 

Sequence No: 4
Sequence Length: 1065
Sequence Type: Nucleic Acid
Number of Chains: Double Stranded
Topology: Straight Chain
Sequence Category: cDNA to mRNA
Origin:

Product Name: Homo Sapiens
Cell Category: Stomach Cancer Tissue
Clone Name: HP01461
Sequence

[Sequence Chart]

Sequence No: 5
Sequence Length: 1725
Sequence Type: Nucleic Acid
Number of Chains: Double Stranded
Topology: Straight Chain
Sequence Category: cDNA to mRNA
Origin:

Product Name: Homo Sapiens
Cell Category: Stomach Cancer Tissue
Clone Name: HP01461

Sequence Characteristics
Indicator Code: CDS
Position: 82..1149

Sequence



[0036] 

[Brief Explanation of the Drawings] 

[FIG 1] A diagram indicating the structure of plasmid pHP01461.

[FIG 2] A diagram of the results of SDS-PAGE analysis of (1) the in vitro-translated
human galectin-9-like protein and (2) the human galectin-9-like protein bound to the
lactose column.



[FIG 3] A diagram of the results from northern blot hybridization using a cDNA fragment 
as a probe. 

[FIG 4] A diagram of the results from northern blot hybridization using the oligonucleotide
for the inserted section as a probe.
[Document Name] Drawings
[FIG 1] 
[FIG 2] 
[FIG 3] 

1 ... Heart
2 ... Brain 

3 ... Placenta 
4 ... Lungs 
5 ... Liver 

6 ... Skeletal Muscle 

7 ... Kidneys 
8 ... Pancreas 

9 ... Spleen 

10 ... Thymus Gland

11 ... Prostate Gland
12 ... Testicles 

13 ... Ovaries 

14 ... Small Intestines 

15 ... Large Intestines 

16 ... Peripheral Blood White Blood Cells 

[FIG 4] 

1 ... Heart 

2 ... Brain 

3 ... Placenta 

4 ... Lungs 

5 ... Liver 

6 ... Skeletal Muscle 

7 ... Kidneys 

8 ... Pancreas 

9 ... Spleen 

10 ... Thymus Gland 

11 ... Prostate Gland 

12 ... Testicles 

13 ... Ovaries 

14 ... Small Intestines 

15 ... Large Intestines 

16 ... Peripheral Blood White Blood Cells 



[Document Name] Abstract 

[Purpose] To provide a lactose-binding human galectin-9-like protein expressed
specifically in the gastrointestinal tract, and human cDNA encoding this protein.

[Constitution] Human cDNA encoding a galectin-9-like protein is cloned, the protein
encoded by the human cDNA is expressed in E. coli, and the lactose binding activity of
the expressed product is measured.



[Selected Drawing] None
[0019]

Chart 1

HS  MAFSGSQAPYLSPAVPFSGTIQGGLQDGLQITVNGTVLSSSGTRFAVNFQTGFSGNDIAF
G9  MAFSGSQAPYLSPAVPFSGTIQGGLQDGLQITVNGTVLSSSGTRFAVNFQTGFSGNDIAF

HS  HFNPRFEDGGYVVCNTRQNGSWGPEERKTHMPFQKGMPFDLCFLVQSSDFKVMVNGILFV
G9  HFNPRFEDGGYVVCNTRQNGSWGPEERRTHMPFQK-MPFDLCFLVQSSDFKVMVNGILFV

HS  QYFHRVPFHRVDTISVNGSVQLSYISFQNPRTVPVQPAFSTVPFSQPVCFPPRPRGRRQK
G9  QYFHRVPFHRVDTIFVNGSVQLSYISFQ--------------------------------

HS  PPGVWPANPAPITQTVIHTVQSAPGQMFSTPAIPPMMYPHPAYPMPFITTILGGLYPSKS
G9  PPGVWPANPAPITQTVIHTVQSAPGQMFSTPAIPPMMYPHPAYPMPFITTILGGLYPSKS

HS  ILLSGTVLPSAQRFHINLCSGNHIAFHLNPRFDENAVVRNTQIDNSWGSEERSLPRKMPF
G9  ILLSGTVLPSAQRFHINLCSGNHIAFHLNLRFDENAVVRNTQIDNSWGSEERSLPRKMPF

HS  VRGQSFSVWILCEAHCLKVAVDGQHLFEYYHRLRNLPTINRLEVGGDIQLTHVQT
G9  VRGQSFSVWILCGAHCLKVAVDGQHLFEYYHRLRNLPTINRLEVGGDIQLTHVQT


[0020]

Chart 2

HS  MAFSGSQAPYLSPAVPFSGTIQGGLQDGLQITVNGTVLSSSGTRFAVNFQTGFSGNDIAF
MM  MALFSAQSPYINPIIPFTGPIQGGLQEGLQVTLQGTT-KSFAQRFVVNFQNSFNGNDIAF

HS  HFNPRFEDGGYVVCNTRQNGSWGPEERKTHMPFQKGMPFDLCFLVQSSDFKVMVNGILFV
MM  HFNPRFEEGGYVVCNTKQNGQWGPEERKMQMPFQKGMPFELCFLVQRSEFKVMVNKKFFV

HS  QYFHRVPFHRVDTISVNGSVQLSYISFQNPRTVPVQPAFSTVPFSQPVCFPPRPRGRRQK
MM  QYQHRVPYHLVDTIAVSGCLKLSFITFQNS-AAPVQHVFSTLQFSQPVQFPRTPKGRKQK

HS  PPGVWPANPAPITQTVIHTVQSAPGQMFSTPAIPPMMYPHPAYPMPFITTILGGLYPSKS
MM  TQNFRPAHQAPMAQTTIHMVHSTPGQMFSTPGIPPVVYPTPAYTIPFYTPIPNGLYPSKS

HS  ILLSGTVLPSAQRFHINLCSGNHIAFHLNPRFDENAVVRNTQIDNSWGSEERSLPRKMPF
MM  IMISGNVLPDATRFHINLRCGGDIAFHLNPRFNENAVVRNTQINNSWGQEERSLLGRMPF

HS  VRGQSFSVWILCEAHCLKVAVDGQHLFEYYHRLRNLPTINRLEVGGDIQLTHVQT
     *********.**.**.****.***. ******.**..** ***.**********
MM  SRGQSFSVWIICEGHCFKVAVNGQHMCEYYHRLKNLQDINTLEVAGDIQLTHVQT



[0 0 2 1 ] 

(2) >f >^hn»9?K:j:SSee^ 

*$&m<Z>c DNA^ffS^^^!- p HP 0 1 4 6 1 &M^T, T N T9iJ-3r 
OIK C 35 S] *^*-->£SSflIIU Ml#^7W^Vh-^7^Hfc 
HP 0 1 4 1 6 2 ^ g T N T «7+f 50^1, M« ( 

' 3fy MctfB) 4 ^ 1. .7 3 ^Kffl^* (*^;i-->£^i;&v%) 2 # 1 % [ 35 
S] %=f-Hc^-y 8 /tt 1 (0. 3 7MBq//tl) V T7RN 

AsitV * 9-1? 2 fi 1 , RNasin 8 0U?:^tJlil 0 0 /i KDSMt 
" .1?3 0TCt9 O^rM^tfco MM3/i 1}CSDSD->^U>^^77- 
v (12 5mMhntii«, pH6.'8, 1 2 0 mM 2 - * y h X # 7 
2%SDSM, 0. 0 2 5%^D ; E7i;-Jl/yjb- 
2 /£ 1 &fltl>t. 9 5iC3»Wlfet, SDS-/K'J7^U;i7 5 

' #3&t8©c DNAli. ^i»4 0kDa©»|RIfi|S:4l«lfc 

(0 2) . BE^I##2"e*S4xS*6SIB?aA^ : ?«S*i*SeR<Z>^ 
i^i3 9, 5 1 7fc-gtl, i©cDNA*«?a^fCBB^J##2 7**S*l*ai 

'[0 0 2 2] 

t7rn-X4B^l/W (77;i/7i/7t) lOOmUO. 5M^Kt 
h'J7At+Wbfct, 10 Oml COO. 5M^ithy7Ai:!ISbfe. 
iftlCfcf— ;i/*;i/3fc> 1 0ml &a&inu ^ffiT? 1 l$ISfjft^jWc!B#Cfc. ^ 
£0. 5M&gft^ h U tf*T?8fc?*Ufc«, 10%7^h-^ 0. 5M^ttb 

U^AS?»[jc!»«8b^fi-?f— iftlg^^icSt^Lfco >f;b£0. 5M»h'J7 

8 HJSE# 2002-3073805 



9 — 226468 
A, 0. 0 5M'J>iW (pH7„ 0) 

N - ^i^ftt 7 7 D - ^ 4 B 0. 0 2%T*Jit± V U ? A 

0 5MU>K«IS (pH7. 0) tf», 4 B Ct'«#bfe 0 
[0 0 2 3] 

>r >tr hn»sR&js?Ri ooai zmzmm^t^v h-^m^t^yTn- 

(^-yF§i4. 5ml) &C^tf\ 9 # h - * # 9 AJ§# 9 Aj^ffi 
^ (2 0mMbiJ^IfI, pH7. 5, 2 mM EDTA, 1 5 0 mM 
NaCl, 4mM2-^;i/*^hx3!;-;K 0. 01%Triton X-l 
0 0) 2 0 mlt«. 0. 3M5^ h-^$:^tf*7Aif 12 0 m 1 Tfig 
fflbfco ^©^m, $SEttiS#K:4 0 kD a<Dj§SIRjg»^£;nT^£- 
*»W©SaHB5^ h-*|g^fB£^*S;ifc#j*3*ifc (02) „ 
[0 0 2 4] 

(4) *JlM^cj:£:ffl/#^>-4aigB£©$8^fc9# h-x^tt 

^5^U*pHP0 14 6 1. 1 a* g 2 0#g©EcoRIi:2 0ffi(5 
NotlHSftLfel, 0. 8%T^n-^>f;i/«^g&{C^^, jjl&l . 7kb 
pfflDNAi^^^ffJifJbfc. O^T% Xmmmftm'*? Z-P ET2 
la (N o v a g e n*±$g) Ug?:2 0M(DEcoRIi:2 0f|ifflNot 
IT'MttLfc^ 0. 8%y#n-*^;i/*£C8ci&lC;frtt, *&5. 3kbp<Z)D 

-14 6 1 «tfRB»3R-WWT«SBrtC «fc y BWii'rSjlHSI^&flftSOfe. 

[0 0 2 5] 

2*©t'J ^?^l/tf F^7>f7-PR1 (5' — CGCATATGGCC 
TTCAGCGGTTCCCAGGC-3' ) ilPR2 (5' — AACGGCA 
CCGTGGAGAAGGCAGGCTGAACA — 3' ) &DNA@|^i 

:7°^*^ FpHP0146Ulng t^7^7-PR 1, PR 2^-*l-6*l 1 0 
0 p mo 1 e trfS^X. PCR^-yh (SS1|±) iZ J: *J c D N ACD 5 ' fflft3R 
fKfgc£igipgLfc 0 7x;-« ^-;i/tfcJRft, 2 0»SacIi:N 



2002-3073805 



4$ 9-226468 
del T'M-ftU BL^mm^ 1 . 2 %7 iSU -XtfJim^LfaWjlZfrtt. ,ftj3 2 0 
[0 0 2 6] 

2Of&0SacIilNdeIT'i 
ftL£^> 0. 8%T#n-xy;b«^<ifrlCfr#, 3. 8kbp(ODNA»ijt 

&^;i/^e>w*;ffibfe. z<ddn AmftnftizpcRiz£^rmmhfcffi3 2 0 

bpfflDNAilt?:, 9-*y-S/H>*<i/ MC«fc»;$£*g|fe, ^I|BL21 (D 
E 3) ^ffiibfc. ^5t*E&##e>:7^xs F P E T 1 4 6 1 £»SU fill 

[0 0 2 7] 

PET1461/BL21 (DE3) 0-H^ii2ml 5:1 00/tg/ml 
7>^^'J>MLB§S2 0 0ml(CilU 3 7 °C T»Jg £ ? ig# U A 600 
#$jo. 5 (c&o£££lc>f y t?;i/^;fr#5^ hi/ l mMtc&£ J; 3 (c 

^7Affljj7AW2 5ml(CMlfc 0 i©M?:SfiM, 
> ±S&5feJcaw$3bfc9^ h-^H^b-fe7rn-^4 B*^A K»;F8«2 
ml) KfrV* 5^ h - X*7A^*7AM 10ml t«t, 0. 3 M5 
? b-XZ&tilJ^&MffimSm 1 7??£ffiU£ 0 Mbt^fegaiS: SDS- 
tfVT? U;i/7 ^ K*^CSc|&fC**Wfci:^5, 4 0 k D a CDfifBtC^— CZ>^> K 

[0 0 2 8] 

(5) ;if>^D»;hA>fy»J^>f^-Va> 

t7^;^-5:^n->f>y^tti)^iALfe. ^7^U*pHP0 1 04 9 & 
ApaLI^BstXI -eMftLfeU, TJJU-Xtf)im%,foMlZfrV}r c DN 



1 0 
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#:qz 9 — 226468 

»J, C 32 P] dCTP (TV-'i/t^tfc) T5«g»Lfc. ffAgPr^tCo^T 
it. =f3? ^ 1/^5=- K5' -AACGGCACCGTGGAGAAGGC 

AGGCTGAGCA-3' T 4 jK U * * I/**" K3f"*— If T'*^ 32 P 

[0 0 2 9] 

^w, Mil. «n mm. mm. mm, *»Tf$^#sg«>£4xfc. 

TMv^^i^cW:, (04). »2k©;t>K^ 

[0030]

[0031]
[Sequence Listing]

Sequence No.: 1
Sequence length: 32
Origin:
Clone name: HP01461

Asn Pro Arg Thr Val Pro Val Gln Pro Ala Phe Ser Thr Val Pro Phe
  1               5                  10                  15
Ser Gln Pro Val Cys Phe Pro Pro Arg Pro Arg Gly Arg Arg Gln Lys
             20                  25                  30

[0032]

Sequence No.: 2
Sequence length: 355
Sequence type: amino acid
Hypothetical: No
Origin:
Clone name: HP01461
Sequence

Met Ala Phe Ser Gly Ser Gln Ala Pro Tyr Leu Ser Pro Ala Val Pro
  1               5                  10                  15
Phe Ser Gly Thr Ile Gln Gly Gly Leu Gln Asp Gly Leu Gln Ile Thr
             20                  25                  30
Val Asn Gly Thr Val Leu Ser Ser Ser Gly Thr Arg Phe Ala Val Asn
         35                  40                  45
Phe Gln Thr Gly Phe Ser Gly Asn Asp Ile Ala Phe His Phe Asn Pro
     50                  55                  60
Arg Phe Glu Asp Gly Gly Tyr Val Val Cys Asn Thr Arg Gln Asn Gly
 65                  70                  75                  80
Ser Trp Gly Pro Glu Glu Arg Lys Thr His Met Pro Phe Gln Lys Gly
                 85                  90                  95
Met Pro Phe Asp Leu Cys Phe Leu Val Gln Ser Ser Asp Phe Lys Val
            100                 105                 110
Met Val Asn Gly Ile Leu Phe Val Gln Tyr Phe His Arg Val Pro Phe
        115                 120                 125
His Arg Val Asp Thr Ile Ser Val Asn Gly Ser Val Gln Leu Ser Tyr
    130                 135                 140
Ile Ser Phe Gln Asn Pro Arg Thr Val Pro Val Gln Pro Ala Phe Ser
145                 150                 155                 160
Thr Val Pro Phe Ser Gln Pro Val Cys Phe Pro Pro Arg Pro Arg Gly
                165                 170                 175
Arg Arg Gln Lys Pro Pro Gly Val Trp Pro Ala Asn Pro Ala Pro Ile
            180                 185                 190
Thr Gln Thr Val Ile His Thr Val Gln Ser Ala Pro Gly Gln Met Phe
        195                 200                 205
Ser Thr Pro Ala Ile Pro Pro Met Met Tyr Pro His Pro Ala Tyr Pro
    210                 215                 220
Met Pro Phe Ile Thr Thr Ile Leu Gly Gly Leu Tyr Pro Ser Lys Ser
225                 230                 235                 240
Ile Leu Leu Ser Gly Thr Val Leu Pro Ser Ala Gln Arg Phe His Ile
                245                 250                 255
Asn Leu Cys Ser Gly Asn His Ile Ala Phe His Leu Asn Pro Arg Phe
            260                 265                 270
Asp Glu Asn Ala Val Val Arg Asn Thr Gln Ile Asp Asn Ser Trp Gly
        275                 280                 285
Ser Glu Glu Arg Ser Leu Pro Arg Lys Met Pro Phe Val Arg Gly Gln
    290                 295                 300
Ser Phe Ser Val Trp Ile Leu Cys Glu Ala His Cys Leu Lys Val Ala
305                 310                 315                 320
Val Asp Gly Gln His Leu Phe Glu Tyr Tyr His Arg Leu Arg Asn Leu
                325                 330                 335
Pro Thr Ile Asn Arg Leu Glu Val Gly Gly Asp Ile Gln Leu Thr His
            340                 345                 350
Val Gln Thr
        355
[0033]

Sequence No.: 3
Sequence length: 96
Sequence type: nucleic acid
Molecule type: cDNA to mRNA
Origin:
Clone name: HP01461
Sequence

AACCCCCGCA CAGTCCCTGT TCAGCCTGCC TTCTCCACGG TGCCGTTCTC CCAGCCTGTC 60 
TGTTTCCCAC CCAGGCCCAG GGGGCGCAGA CAAAAA 96 
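
The relationship between Sequence No. 3 and Sequence No. 1 can be checked mechanically. The following is a minimal sketch, not part of the original specification: it assumes the standard genetic code, and the helper name `translate` and the one-letter rendering of Sequence No. 1 are introduced here only for illustration.

```python
# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in reading frame 1; whitespace is ignored."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# Sequence No. 3 as printed in the listing (96 bases).
SEQ_NO_3 = """
AACCCCCGCA CAGTCCCTGT TCAGCCTGCC TTCTCCACGG TGCCGTTCTC CCAGCCTGTC
TGTTTCCCAC CCAGGCCCAG GGGGCGCAGA CAAAAA
"""

# Sequence No. 1 (32 residues), rewritten here in one-letter code.
SEQ_NO_1 = "NPRTVPVQPAFSTVPFSQPVCFPPRPRGRRQK"

peptide = translate(SEQ_NO_3)
print(peptide == SEQ_NO_1)  # True when the two listings agree
```

The same helper could be applied to Sequence No. 4 to compare it against the 355 residues of Sequence No. 2.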

[0034]
Sequence No.: 4
Sequence length: 1065

mom : 



Molecule type: cDNA to mRNA
Origin:
Clone name: HP01461



ATGGCCTTCA GCGGTTCCCA GGCTCCCTAC CTGAGTCCAG CTGTCCCCTT TTCTGGGACT 60
ATTCAAGGAG GTCTCCAGGA CGGACTTCAG ATCACTGTCA ATGGGACCGT TCTCAGCTCC 120
AGTGGAACCA GGTTTGCTGT GAACTTTCAG ACTGGCTTCA GTGGAAATGA CATTGCCTTC 180
CACTTCAACC CTCGGTTTGA AGATGGAGGG TACGTGGTGT GCAACACGAG GCAGAACGGA 240
AGCTGGGGGC CCGAGGAGAG GAAGACACAC ATGCCTTTCC AGAAGGGGAT GCCCTTTGAC 300
CTCTGCTTCC TGGTGCAGAG CTCAGATTTC AAGGTGATGG TGAACGGGAT CCTCTTCGTG 360
CAGTACTTCC ACCGCGTGCC CTTCCACCGT GTGGACACCA TCTCCGTCAA TGGCTCTGTG 420
CAGCTGTCCT ACATCAGCTT CCAGAACCCC CGCACAGTCC CTGTTCAGCC TGCCTTCTCC 480
ACGGTGCCGT TCTCCCAGCC TGTCTGTTTC CCACCCAGGC CCAGGGGGCG CAGACAAAAA 540
CCTCCCGGCG TGTGGCCTGC CAACCCGGCT CCCATTACCC AGACAGTCAT CCACACAGTG 600
CAGAGCGCCC CTGGACAGAT GTTCTCTACT CCCGCCATCC CACCTATGAT GTACCCCCAC 660
CCCGCCTATC CGATGCCTTT CATCACCACC ATTCTGGGAG GGCTGTACCC ATCCAAGTCC 720
ATCCTCCTGT CAGGCACTGT CCTGCCCAGT GCTCAGAGGT TCCACATCAA CCTGTGCTCT 780
GGGAACCACA TCGCCTTCCA CCTGAACCCC CGTTTTGATG AGAATGCTGT GGTCCGCAAC 840
ACCCAGATCG ACAACTCCTG GGGGTCTGAG GAGCGAAGTC TGCCCCGAAA AATGCCCTTC 900
GTCCGTGGCC AGAGCTTCTC AGTGTGGATC TTGTGTGAAG CTCACTGCCT CAAGGTGGCC 960
GTGGATGGTC AGCACCTGTT TGAATACTAC CATCGCCTGA GGAACCTGCC CACCATCAAC 1020
AGACTGGAAG TGGGGGGCGA CATCCAGCTG ACCCATGTGC AGACA 1065
Sequence No.: 5
Sequence length: 1725

TTTCTTTGTT AAGTCGTTCC CTCTACAAAG GACTTCCTAG TGGGTGTGAA AGGCAGCGGT 60
GGCCACAGAG GCGGCGGAGA G ATG GCC TTC AGC GGT TCC CAG GCT CCC TAC 111
                        Met Ala Phe Ser Gly Ser Gln Ala Pro Tyr
                          1               5                  10

CTG AGT CCA GCT GTC CCC TTT TCT GGG ACT ATT CAA GGA GGT CTC CAG 159
Leu Ser Pro Ala Val Pro Phe Ser Gly Thr Ile Gln Gly Gly Leu Gln
                 15                  20                  25

GAC GGA CTT CAG ATC ACT GTC AAT GGG ACC GTT CTC AGC TCC AGT GGA 207
Asp Gly Leu Gln Ile Thr Val Asn Gly Thr Val Leu Ser Ser Ser Gly
             30                  35                  40

ACC AGG TTT GCT GTG AAC TTT CAG ACT GGC TTC AGT GGA AAT GAC ATT 255
Thr Arg Phe Ala Val Asn Phe Gln Thr Gly Phe Ser Gly Asn Asp Ile
         45                  50                  55

GCC TTC CAC TTC AAC CCT CGG TTT GAA GAT GGA GGG TAC GTG GTG TGC 303
Ala Phe His Phe Asn Pro Arg Phe Glu Asp Gly Gly Tyr Val Val Cys
     60                  65                  70

AAC ACG AGG CAG AAC GGA AGC TGG GGG CCC GAG GAG AGG AAG ACA CAC 351
Asn Thr Arg Gln Asn Gly Ser Trp Gly Pro Glu Glu Arg Lys Thr His
 75                  80                  85                  90

ATG CCT TTC CAG AAG GGG ATG CCC TTT GAC CTC TGC TTC CTG GTG CAG 399
Met Pro Phe Gln Lys Gly Met Pro Phe Asp Leu Cys Phe Leu Val Gln
                 95                 100                 105

AGC TCA GAT TTC AAG GTG ATG GTG AAC GGG ATC CTC TTC GTG CAG TAC 447
Ser Ser Asp Phe Lys Val Met Val Asn Gly Ile Leu Phe Val Gln Tyr
            110                 115                 120

TTC CAC CGC GTG CCC TTC CAC CGT GTG GAC ACC ATC TCC GTC AAT GGC 495
Phe His Arg Val Pro Phe His Arg Val Asp Thr Ile Ser Val Asn Gly
        125                 130                 135

TCT GTG CAG CTG TCC TAC ATC AGC TTC CAG AAC CCC CGC ACA GTC CCT 543
Ser Val Gln Leu Ser Tyr Ile Ser Phe Gln Asn Pro Arg Thr Val Pro
    140                 145                 150

GTT CAG CCT GCC TTC TCC ACG GTG CCG TTC TCC CAG CCT GTC TGT TTC 591
Val Gln Pro Ala Phe Ser Thr Val Pro Phe Ser Gln Pro Val Cys Phe
155                 160                 165                 170

CCA CCC AGG CCC AGG GGG CGC AGA CAA AAA CCT CCC GGC GTG TGG CCT 639
Pro Pro Arg Pro Arg Gly Arg Arg Gln Lys Pro Pro Gly Val Trp Pro
                175                 180                 185

GCC AAC CCG GCT CCC ATT ACC CAG ACA GTC ATC CAC ACA GTG CAG AGC 687
Ala Asn Pro Ala Pro Ile Thr Gln Thr Val Ile His Thr Val Gln Ser
            190                 195                 200

GCC CCT GGA CAG ATG TTC TCT ACT CCC GCC ATC CCA CCT ATG ATG TAC 735
Ala Pro Gly Gln Met Phe Ser Thr Pro Ala Ile Pro Pro Met Met Tyr
        205                 210                 215

CCC CAC CCC GCC TAT CCG ATG CCT TTC ATC ACC ACC ATT CTG GGA GGG 783
Pro His Pro Ala Tyr Pro Met Pro Phe Ile Thr Thr Ile Leu Gly Gly
    220                 225                 230

CTG TAC CCA TCC AAG TCC ATC CTC CTG TCA GGC ACT GTC CTG CCC AGT 831
Leu Tyr Pro Ser Lys Ser Ile Leu Leu Ser Gly Thr Val Leu Pro Ser
235                 240                 245                 250

GCT CAG AGG TTC CAC ATC AAC CTG TGC TCT GGG AAC CAC ATC GCC TTC 879
Ala Gln Arg Phe His Ile Asn Leu Cys Ser Gly Asn His Ile Ala Phe
                255                 260                 265

CAC CTG AAC CCC CGT TTT GAT GAG AAT GCT GTG GTC CGC AAC ACC CAG 927
His Leu Asn Pro Arg Phe Asp Glu Asn Ala Val Val Arg Asn Thr Gln
            270                 275                 280

ATC GAC AAC TCC TGG GGG TCT GAG GAG CGA AGT CTG CCC CGA AAA ATG 975
Ile Asp Asn Ser Trp Gly Ser Glu Glu Arg Ser Leu Pro Arg Lys Met
        285                 290                 295

CCC TTC GTC CGT GGC CAG AGC TTC TCA GTG TGG ATC TTG TGT GAA GCT 1023
Pro Phe Val Arg Gly Gln Ser Phe Ser Val Trp Ile Leu Cys Glu Ala
    300                 305                 310

CAC TGC CTC AAG GTG GCC GTG GAT GGT CAG CAC CTG TTT GAA TAC TAC 1071
His Cys Leu Lys Val Ala Val Asp Gly Gln His Leu Phe Glu Tyr Tyr
315                 320                 325                 330

CAT CGC CTG AGG AAC CTG CCC ACC ATC AAC AGA CTG GAA GTG GGG GGC 1119
His Arg Leu Arg Asn Leu Pro Thr Ile Asn Arg Leu Glu Val Gly Gly
                335                 340                 345

GAC ATC CAG CTG ACC CAT GTG CAG ACA TAGGCGGCTT CCTGGCCCTG GGGC 1170
Asp Ile Gln Leu Thr His Val Gln Thr
            350                 355

CGGGGGCTGG GGTGTGGGGC AGTCTGGGTC CTCTCATCAT CCCCACTTCC CAGGCCCAGC 1230
CTTTCCAACC CTGCCTGGGA TCTGGGCTTT AATGCAGAGG CCATGTCCTT GTCTGGTCCT 1290
GCTTCTGGCT ACAGCCACCC TGGAACGGAG AAGGCAGCTG ACGGGGATTG CCTTCCTCAG 1350
CCGCAGCAGC ACCTGGGGCT CCAGCTGCTG GAATCCTACC ATCCCAGGAG GCAGGCACAG 1410
CCAGGGAGAG GGGAGGAGTG GGCAGTGAAG ATGAAGCCCC ATGCTCAGTC CCCTCCCATC 1470
CCCCACGCAG CTCCACCCCA GTCCCAAGCC ACCAGCTGTC TGCTCCTGGT GGGAGGTGGC 1530
CTCCTCAGCC CCTCCTCTCT GACCTTTAAC CTCACTCTCA CCTTGCACCG TGCACCAACC 1590
CTTCACCCCT CCTGGAAAGC AGGCCTGATG GCTTCCCACT GGCCTCCACC ACCTGACCAG 1650
AGTGTTCTCT TCAGAGGACT GGCTCCTTTC CCAGTGTCCT TAAAATAAAG AAATGAAAAT 1710
GCTTGTTGGC ACATT 1725